Similar resources
Cross-Lingual Image Caption Generation
Automatically generating a natural language description of an image is a fundamental problem in artificial intelligence. This task involves both computer vision and natural language processing and is called “image caption generation.” Research on image caption generation has typically focused on taking in an image and generating a caption in English as existing image caption corpora are mostly ...
Topic-Specific Image Caption Generation
Recently, image captioning, which aims to generate a textual description for an image automatically, has attracted researchers from various fields. Encouraging performance has been achieved by applying deep neural networks. Most of these works aim at generating a single caption, which may be incomplete, especially for complex images. This paper proposes a topic-specific multi-caption generator, ...
Image Caption Generation with Recursive Neural Networks
The ability to recognize image features and generate accurate, syntactically reasonable text descriptions is important for many tasks in computer vision. Auto-captioning could, for example, be used to provide descriptions of website content, or to generate frame-by-frame descriptions of video for the vision-impaired. In this project, a multimodal architecture for generating image captions is ex...
DeepDiary: Automatic Caption Generation for Lifelogging Image Streams
Lifelogging cameras capture everyday life from a first-person perspective, but generate so much data that it is hard for users to browse and organize their image collections effectively. In this paper, we propose to use automatic image captioning algorithms to generate textual representations of these collections. We develop and explore novel techniques based on deep learning to generate caption...
Guiding Long-Short Term Memory for Image Caption Generation
In this work we focus on the problem of image caption generation. We propose an extension of the long short term memory (LSTM) model, which we coin gLSTM for short. In particular, we add semantic information extracted from the image as extra input to each unit of the LSTM block, with the aim of guiding the model towards solutions that are more tightly coupled to the image content. Additionally,...
Journal
Journal title: Neurocomputing
Year: 2019
ISSN: 0925-2312
DOI: 10.1016/j.neucom.2018.10.059